translated by 谷歌翻译
Image enhancement is a frequently used technique in digital image processing. In recent years, learning-based techniques for improving the aesthetic quality of photographs have grown in popularity. However, most current works do not optimize an image across different frequency bands, and they typically focus on either pixel-level or global-level enhancement. In this paper, we propose a transformer-based model in the wavelet domain that refines the different frequency bands of an image. Our method attends to both local details and high-level features, which yields superior enhancement results. In comprehensive benchmark evaluations, our method outperforms state-of-the-art methods.
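As a rough illustration of what working "in the wavelet domain" involves, the sketch below splits an image into one low-frequency band and three high-frequency detail bands and then reconstructs it exactly. The abstract does not say which wavelet is used, so the Haar wavelet here is an assumption, and the function names `haar_dwt2` and `haar_idwt2` are hypothetical; the transformer that refines the bands is not reproduced.

```python
import numpy as np

SQRT2 = np.sqrt(2.0)

def haar_dwt2(img):
    """One level of a 2D Haar transform; height and width must be even."""
    img = np.asarray(img, dtype=np.float64)
    # Sums (low-pass) and differences (high-pass) of vertically adjacent pixels.
    lo = (img[0::2, :] + img[1::2, :]) / SQRT2
    hi = (img[0::2, :] - img[1::2, :]) / SQRT2
    # Same split over horizontally adjacent pixels: four half-resolution bands.
    ll = (lo[:, 0::2] + lo[:, 1::2]) / SQRT2  # coarse approximation
    lh = (lo[:, 0::2] - lo[:, 1::2]) / SQRT2  # horizontal differences
    hl = (hi[:, 0::2] + hi[:, 1::2]) / SQRT2  # vertical differences
    hh = (hi[:, 0::2] - hi[:, 1::2]) / SQRT2  # diagonal differences
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    """Exact inverse of haar_dwt2."""
    lo = np.empty((ll.shape[0], 2 * ll.shape[1]))
    hi = np.empty_like(lo)
    lo[:, 0::2] = (ll + lh) / SQRT2
    lo[:, 1::2] = (ll - lh) / SQRT2
    hi[:, 0::2] = (hl + hh) / SQRT2
    hi[:, 1::2] = (hl - hh) / SQRT2
    img = np.empty((2 * lo.shape[0], lo.shape[1]))
    img[0::2, :] = (lo + hi) / SQRT2
    img[1::2, :] = (lo - hi) / SQRT2
    return img
```

In a pipeline of this kind, each band could be refined separately before calling the inverse transform; how the proposed model does this is not specified in the abstract.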
Shadow removal improves the visual quality and legibility of digital copies of documents. However, document shadow removal remains an open problem. Traditional techniques rely on heuristics that vary from situation to situation. Given the quality and quantity of current public datasets, most neural network models are ill-equipped for this task. In this paper, we propose a Transformer-based model for document shadow removal that encodes and decodes shadow context in both shadow and shadow-free regions. Shadow detection and pixel-level enhancement are also included in the overall coarse-to-fine process. In comprehensive benchmark evaluations, the model is competitive with state-of-the-art methods.
Proteins play a central role in biology, from immune recognition to brain activity. While major advances in machine learning have improved our ability to predict protein structure from sequence, determining protein function from structure remains a major challenge. Here, we introduce the Holographic Convolutional Neural Network (H-CNN) for proteins, a physically motivated machine learning approach that models amino acid preferences in protein structures. H-CNN reflects physical interactions in a protein structure and recapitulates the functional information stored in evolutionary data. H-CNN accurately predicts the impact of mutations on protein function, including stability and binding of protein complexes. Our interpretable computational model for protein structure-function maps could guide the design of novel proteins with desired functions.
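Cross-modal hashing is a successful approach to large-scale multimedia retrieval, and many hashing methods based on matrix factorization have been proposed. However, existing methods still struggle with several problems, such as how to generate binary codes efficiently instead of directly relaxing them to continuous values. In addition, most existing methods optimize over an $n \times n$ similarity matrix, which makes memory and computation unaffordable. In this paper, we propose a novel Asymmetric Scalable Cross-Modal Hashing (ASCMH) method to address these issues. It first introduces collective matrix factorization to learn a common latent space from the kernelized features of the different modalities, and then transforms the similarity matrix optimization into a distance-distance difference minimization problem with the help of semantic labels and the common latent space. This relieves the computational complexity of the $n \times n$ asymmetric optimization. When generating hash codes, we also impose an orthogonal constraint on the label information, which is indispensable for search accuracy and greatly reduces computational redundancy. For efficient optimization and scalability to large datasets, we adopt a two-step approach rather than optimizing everything simultaneously. Extensive experiments on three benchmark datasets, Wiki, MIRFlickr-25K, and NUS-WIDE, show that ASCMH outperforms state-of-the-art cross-modal hashing methods in both accuracy and efficiency.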
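To see why an explicit $n \times n$ similarity matrix is costly and how label information can sidestep it, consider the generic sketch below. It assumes, purely for illustration, that similarity is defined as $S = L L^\top$ for a label matrix $L$ of shape $n \times c$; this is a common trick in supervised hashing and is not the ASCMH formulation itself. The names `naive_term` and `factored_term` are hypothetical.

```python
import numpy as np

def naive_term(B, L, V):
    """Compute B^T S V by materializing S = L L^T (O(n^2) memory)."""
    S = L @ L.T            # n x n similarity matrix
    return B.T @ S @ V     # r x r result

def factored_term(B, L, V):
    """Same quantity without ever forming the n x n matrix."""
    # (B^T L) is r x c and (L^T V) is c x r, so memory stays O(n * max(r, c)).
    return (B.T @ L) @ (L.T @ V)

rng = np.random.default_rng(0)
n, c, r = 200, 5, 16                               # samples, classes, code length
L = rng.integers(0, 2, size=(n, c)).astype(float)  # multi-label indicators
B = np.sign(rng.standard_normal((n, r)))           # stand-in binary codes
V = rng.standard_normal((n, r))                    # stand-in latent features
```

Both functions return the same $r \times r$ matrix, but only the first one needs storage that grows quadratically with the number of samples.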
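The goal of the AVA challenge is to provide vision-based benchmarks and methods relevant to accessibility. In this paper, we present the technical details of our submission to the CVPR2022 AVA Challenge. First, we conducted experiments to help select an appropriate model and data augmentation strategy for this task. Second, we applied an effective training strategy to improve performance. Third, we integrated the results of two different segmentation frameworks to improve performance further. Experimental results show that our approach achieves competitive results on the AVA test set. Finally, our method achieves 63.008% AP@0.50:0.95 on the test set of the CVPR2022 AVA Challenge.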
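The metric AP@0.50:0.95 can be read, assuming the COCO-style convention, as average precision averaged over ten IoU thresholds from 0.50 to 0.95 in steps of 0.05. The sketch below only shows the mask IoU and the threshold sweep, not a full AP computation; the helper names are hypothetical.

```python
import numpy as np

# Ten IoU thresholds: 0.50, 0.55, ..., 0.95 (COCO-style convention, assumed here).
IOU_THRESHOLDS = np.linspace(0.50, 0.95, 10)

def mask_iou(pred, gt):
    """Intersection over union of two boolean masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 0.0  # both masks empty: treat as no overlap
    return float(np.logical_and(pred, gt).sum() / union)

def thresholds_matched(pred, gt):
    """Number of IoU thresholds at which pred would count as a match for gt."""
    return int((mask_iou(pred, gt) >= IOU_THRESHOLDS).sum())

# Toy example: two 5-pixel masks overlapping in 4 pixels (IoU = 4/6).
gt = np.zeros(8, dtype=bool)
gt[0:5] = True
pred = np.zeros(8, dtype=bool)
pred[1:6] = True
```

A prediction with moderate overlap is therefore rewarded at the loose thresholds only, which is why the averaged metric favors precise mask boundaries.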
Image harmonization aims to harmonize composite foreground regions with a specific background image. Previous methods focus on improving the reconstruction ability of the generator through internal enhancements such as attention, adaptive normalization, and light adjustment. However, they pay less attention to discriminating between foreground and background appearance features within a restricted generator, which poses a new challenge for image harmonization. In this paper, we propose a novel image harmonization framework with external style fusion and a region-wise contrastive learning scheme. For the external style fusion, we use the external background appearance from the encoder as the style reference for generating the harmonized foreground in the decoder. This external background guidance strengthens the harmonization ability of the decoder. For the contrastive learning scheme, we design a region-wise contrastive loss function for image harmonization. Specifically, we first introduce a straightforward sample generation method that selects negative samples from the output harmonized foreground region and positive samples from the ground-truth background region. Our method attempts to bring together corresponding positive and negative samples by maximizing the mutual information between the foreground and background styles, which makes our harmonization network more robust in discriminating foreground and background style features when harmonizing composite images. Extensive experiments on benchmark datasets show that our method achieves a clear improvement in harmonization quality and generalizes well to real-scenario applications.
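Image harmonization aims to modify the color of a composited region with respect to a specific background. Previous works model this task as pixel-wise image-to-image translation using UNet-family architectures. However, model size and computational cost limit the use of these models on edge devices and higher-resolution images. To this end, we propose, for the first time, a novel spatial-separated curve rendering network (S$^2$CRNet) for efficient, high-resolution image harmonization. In S$^2$CRNet, we first extract spatial-separated embeddings from thumbnails of the masked foreground and background individually. We then design a curve rendering module (CRM), which uses linear layers to learn and combine spatial-specific knowledge and generates the parameters of a piece-wise curve mapping for the foreground region. Finally, we render the original high-resolution image directly with the learned color curves. We also extend the proposed framework in two ways, Cascaded-CRM and Semantic-CRM, for cascaded refinement and semantic guidance, respectively. Experiments show that the proposed method uses over 90% fewer parameters than previous methods while still achieving state-of-the-art performance on both the synthesized iHarmony4 and the real-world DIH test sets. Moreover, our method runs smoothly on higher-resolution images (e.g., $2048 \times 2048$) in 0.1 seconds, with much lower GPU computational resources than all existing methods. The code will be made available at http://github.com/stefanleong/s2crnet.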
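The final rendering step described above, applying learned color curves to the full-resolution foreground, can be sketched as follows. This is a minimal sketch under stated assumptions: the curve is piece-wise linear with uniformly spaced knots, one curve per RGB channel, and the knot values `knots_y` stand in for parameters that the network would predict from thumbnails. The function name `render_with_curves` is hypothetical and the network itself is not reproduced.

```python
import numpy as np

def render_with_curves(image, mask, knots_y):
    """Apply per-channel piece-wise linear curves to foreground pixels.

    image:   H x W x 3 float array with values in [0, 1]
    mask:    H x W boolean array, True for foreground pixels
    knots_y: 3 x K array of curve outputs at K uniformly spaced inputs in [0, 1]
    """
    image = np.asarray(image, dtype=np.float64)
    mask = np.asarray(mask, dtype=bool)
    knots_y = np.asarray(knots_y, dtype=np.float64)
    knots_x = np.linspace(0.0, 1.0, knots_y.shape[1])
    out = image.copy()
    for c in range(3):
        channel = image[..., c]
        # Piece-wise linear lookup: cost is independent of any network size.
        mapped = np.interp(channel, knots_x, knots_y[c])
        # Background pixels keep their original values.
        out[..., c] = np.where(mask, mapped, channel)
    return out

# Toy example: a brightening curve on a single foreground pixel.
image = np.full((2, 2, 3), 0.25)
mask = np.array([[True, False], [False, False]])
brighten = np.tile([0.0, 1.0, 1.0], (3, 1))  # knots at inputs 0, 0.5, 1
result = render_with_curves(image, mask, brighten)
```

Because the curve parameters are small and the rendering is a per-pixel lookup, a design like this can predict parameters at thumbnail resolution and still apply them to an image of any size.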